An Information-Theoretic Analysis 
of the Security of Communication Systems 
Employing the Encoding-Encryption Paradigm 

Frederique Oggier and Miodrag J. Mihaljevic * 
August 6, 2010 



Abstract 

This paper proposes a generic approach for providing enhanced security to communication systems 
which encode their data for reliability before encrypting it through a stream cipher for security. We 
call this counter-intuitive technique the encoding-encryption paradigm, and use as a motivating example 
the GSM standard for mobile telephony. The enhanced security is based on a dedicated homophonic or 
wire-tap channel coding that introduces pure randomness, combined with the randomness of the noise 
occurring over the communication channel. Security evaluation regarding recovery of the secret key 
employed in the keystream generator is done through an information theoretical approach. 

We show that with the aid of a dedicated wire-tap encoder, the amount of uncertainty that the ad- 
versary must face about the secret key given all the information he could gather during different passive 
or active attacks he can mount, is a decreasing function of the sample available for cryptanalysis. This 
means that the wire-tap encoder can indeed provide an information theoretical security level over a pe- 
riod of time, but after a large enough sample is collected the function tends to zero, entering a regime in 
which a computational security analysis is needed for estimation of the resistance against the secret key 
recovery. 

Keywords: error-correction coding, security evaluation, stream ciphers, randomness, wireless communi- 
cations, homophonic coding, wire-tap channel coding. 



1 Introduction 

Most communication systems take into account not only the reliability but also the security of the data 
they transmit. This is particularly true in wireless environments, where the data is inherently more exposed 
to security threats. Consequently, the design of such systems needs to include both coding schemes for 
providing error-correction and ciphering algorithms for encryption-decryption. It is common practice to 
first encrypt the data to ensure its safety, and then to encode it for reliability. In this paper, we consider 
the reverse scenario, namely systems which first encode the data, and then encrypt it, which we call the 
encoding-encryption paradigm. 

Though counter-intuitive at first, there are actually many real-life applications where the encoding-encryption 
paradigm is used. A famous illustrative example is the most widespread standard for mobile telephony 
GSM, standing for "Global System for Mobile Communications" (see [2] and p], for the coding, respectively 
security details). In the GSM protocol, the data is first encoded using an error-correction code so as to 
withstand reception errors, which considerably increases the size of the message to be transmitted. The 
encoded data is then encrypted to provide privacy (secrecy of the communications) for the users. 

It is interesting to mention that block ciphers are not suitable in the context of the encoding-encryption 
paradigm, since the receiver needs to first decrypt the data despite the noise, before performing the decoding. 
This leads to the use of stream ciphers, and thus when we refer to the security of systems using the 
encoding-encryption paradigm, we implicitly mean the security of the keystream generator and the users' secret key. 

From a security perspective, there are of course pros and cons to the encoding-encryption paradigm. Since 
it implies encryption of redundant data (introduced by error-correction), it could be a starting point for mounting 
attacks against the employed keystream generator. The undesirability of redundant data from a cryptographic 
security point of view was already pointed out in the seminal work by Shannon [18], where 
cryptography as a scientific topic was established. On the other hand, the encoding-encryption paradigm 
has the advantage of offering protection in a known plaintext attack scenario, since an adversary 
can only learn a noisy version of the keystream, which makes the cryptanalysis of the employed keystream 
generator more complex. 

Security evaluation can be performed under two attacking scenarios, depending on whether one considers 
an active or passive adversary. 

A passive adversary's ability is limited to monitoring (and recording) communications between the le- 
gitimate parties, so as to use the recorded data as input for mounting a known plaintext attack against the 
considered system. 

Stronger attacks come from active adversaries, which can possibly include many attacking settings. In this 
paper, we consider active attacks motivated by the class of so-called Hopper and Blum (HB) authentication 
protocols [S],[H] , [ID], HO, [5]. Following the original work by [8], HB authentication protocols are challenge- 
response based, where the response could be considered as the encoded and encrypted version of the challenge, 
which is deliberately degraded by random noise. A simple active attack on the improved HB + authentication 
protocol [ID] was provided in [4] , where it is assumed that an adversary can manipulate challenges sent during 
the authentication exchange, and thus learn whether such manipulations give an authentication failure. The 
attack consists of choosing a constant vector and using it to perturb the challenges, by computing the XOR 
of the selected vector with each authentication challenge vector in each of the authentication 
rounds. To summarize, the active attacker has the following abilities: (i) he can modify the data in the 
communication channel between the legitimate parties; and (ii) he can learn the effect of the performed 
modification at the receiving side. This is the model that will be adopted in this work. 

To evaluate the security of systems using the encoding-encryption paradigm under threats of both passive 
and active adversaries as described above, both computational and information theoretical analyses are valid. 
In this paper, we focus on the latter. We propose a security enhanced approach which employs a dedicated 
coding, following the frameworks of homophonic [H] H2] E] and wire-tap channel coding [2D1 US]- The 
improved security is a consequence of combining the pure randomness introduced by the wire-tap coding 
and the random noise which is inherent in the communication channel. 

We measure the security increase with respect to the secret key in terms of its equivocation, that is 
the amount of uncertainty that the adversary has on the key, given all the information he can collect. A 
preliminary study of the security enhancement has been provided in [16] in the case of a passive adversary. 
The enhancement is based on the constructions reported in [TJ] US] , and also motivated by the fact that in the 
computational complexity evaluation scenarios, this approach provides resistance against the generic time- 
memory trade-off based attacking approaches [7] 113] , and particular powerful techniques like the correlation 
attacks [3]. 

Motivation for the Work. The aim of this work is to propose and elaborate a model for the security evaluation 
of communication systems which employ the encoding-encryption paradigm together with a dedicated 
wire-tap encoder for security enhancement. In a general security evaluation scenario, both passive and active 
attacks should be treated, and while the enhanced system should be resistant to these, it should come with only a 
slight to moderate increase of the implementation complexity and the communications overhead. It may be 
worth emphasizing that our target is to increase the security of existing schemes, such as GSM, which is 
why we have a small margin of freedom in designing the security scheme, since we cannot touch most of the 
existing components of the system. 



Summary of the Results. This paper proposes and analyzes from the information-theoretic point of view the 
security of communications systems based on the encoding-encryption paradigm under passive and active 
attacks, when equipped with an additional wire-tap encoder. We show that with the aid of a dedicated 
wire-tap encoder, the amount of uncertainty that the adversary must face about the secret key given all the 
information he could gather during different passive or active attacks he can mount, is a decreasing function 
of the sample available for cryptanalysis. This means that the wire-tap encoder can indeed provide an 
information theoretical security level over a period of time, but after a large enough sample is collected the 
function tends to zero, entering a regime in which a computational security analysis is needed for estimation 
of the resistance against the secret key recovery. 

Organization of the Paper. In Section 2 we start by describing precisely the system model together with 
its security enhanced version, and we dedicate Subsection 2.2 to the design of the wire-tap encoder. The 
security analysis is done in two parts: first the passive adversary is studied in Section 3, while the active 
one is investigated in Section 4. Practical implications of the given security analysis and some guidelines 
for the design of security enhanced encoding-encryption based systems are pointed out in Section 5. Concluding 
remarks, including some directions for future work, are given in Section 6. 



2 System Model and Wiretap Coding 

We consider a class of communication systems which, to provide both reliability and security, employ the 
encoding and then encryption paradigm, namely: the message is first encoded, and then encrypted using a 
stream cipher. 

Figure 1: Communication system model. 



The detailed model is shown in Figure 1. The transmitter first encodes a binary message (plaintext) 

$$\mathbf{a} = [a_i]_{i=1}^{m} \in \{0,1\}^{m}$$

using an error-correcting code $C_{ECC}$, 

$$\mathbf{b} = C_{ECC}(\mathbf{a}) = [b_i]_{i=1}^{n} \in \{0,1\}^{n},$$

that maps an $m$-dimensional plaintext to an $n$-dimensional encoded message, $n > m$. The encryption is done 
using a keystream generator, which takes as input the secret key $\mathbf{k}$ of the transmitter, and outputs 

$$\mathbf{x} = \mathbf{x}(\mathbf{k}) = [x_i]_{i=1}^{n} \in \{0,1\}^{n},$$

yielding 

$$\mathbf{y} = \mathbf{y}(\mathbf{k}) = C_{ECC}(\mathbf{a}) \oplus \mathbf{x} = [y_i]_{i=1}^{n} \in \{0,1\}^{n} \qquad (1)$$

as the message to be sent over the noisy channel, where $\oplus$ denotes XOR or modulo 2 addition. We denote 
the noise vector by 

$$\mathbf{v} = [v_i]_{i=1}^{n} \in \{0,1\}^{n},$$

where each $v_i$ is the realization of a random variable $V_i$ such that $\Pr(V_i = 1) = p$ and $\Pr(V_i = 0) = 1 - p$. 
Upon reception of the corrupted encrypted binary sequence of ciphertext 

$$\mathbf{z} = \mathbf{z}(\mathbf{k}) = \mathbf{y} \oplus \mathbf{v} = C_{ECC}(\mathbf{a}) \oplus \mathbf{x} \oplus \mathbf{v} = [z_i]_{i=1}^{n} \in \{0,1\}^{n},$$

the receiver who shares the secret key $\mathbf{k}$ with the transmitter can first decrypt the message 

$$(C_{ECC}(\mathbf{a}) \oplus \mathbf{x} \oplus \mathbf{v}) \oplus \mathbf{x} = C_{ECC}(\mathbf{a}) \oplus \mathbf{v} \in \{0,1\}^{n},$$

and then decode $\mathbf{a}$ despite the noise thanks to the error-correction code. We remark that in practice a 
keystream generator can be considered as a finite state machine whose initial state is determined by the 
secret key and some public data. For simplicity, and because it does not affect our analysis, we can ignore 
the existence of the known data, and focus on the secret key. In this setting the output of the keystream 
generator is determined uniquely by the secret key, and it is enough to assume that the transmitter and 
receiver only share the key. 
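
The chain of operations above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions: the error-correcting code is a 3-fold repetition code, and `keystream` is a hypothetical placeholder built on a seeded pseudo-random generator, not a real stream cipher and not the generator used in GSM. The sketch shows that decryption leaves $C_{ECC}(\mathbf{a}) \oplus \mathbf{v}$, which the decoder then cleans up.

```python
import random

def ecc_encode(a, r=3):
    """Toy ECC (assumption): repeat every bit r times."""
    return [bit for bit in a for _ in range(r)]

def ecc_decode(c, r=3):
    """Majority vote over each group of r received bits."""
    return [int(sum(c[i:i + r]) > r // 2) for i in range(0, len(c), r)]

def xor(u, w):
    """Bitwise XOR (modulo 2 addition) of two equal-length bit lists."""
    return [p ^ q for p, q in zip(u, w)]

def keystream(key, n):
    """Hypothetical placeholder keystream x(k); NOT a secure generator."""
    rng = random.Random(key)
    return [rng.getrandbits(1) for _ in range(n)]

a = [1, 0, 1, 1]                   # plaintext, m = 4
b = ecc_encode(a)                  # encoded message, n = 12
x = keystream(key=2010, n=len(b))  # keystream derived from the shared key
y = xor(b, x)                      # equation (1): y = C_ECC(a) xor x
v = [0] * len(b)
v[4] = 1                           # a single channel error
z = xor(y, v)                      # received word z = y xor v
decrypted = xor(z, x)              # equals C_ECC(a) xor v
recovered = ecc_decode(decrypted)  # the ECC removes the channel error
```

Note that the receiver never needs to know `v`: it decrypts first and lets the error-correcting code absorb the noise, which is exactly why a stream cipher (bitwise XOR) is required here.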

Note further that the trick of reversing the order of encryption and error-correction would not have been 
possible if a block cipher was used for encryption, since decryption must be done before removing the channel 
noise. 

We finally assume that there is a noiseless feedback link that connects the receiver to the transmitter, so 
that the receiver can either acknowledge the reception of the message, or inform of the decoding failure, so 
as to get the missing message sent back. 

2.1 Enhanced model 

Origins for the construction given in this paper are the approaches for stream ciphers design recently reported 
in [T31 [H], though the focus of this paper is very different, since its goal is enhancing the security of 
existing encryption schemes. This difference has a number of implications regarding the security issues and 
implementation complexity of the scheme. 

The construction proposed in this paper employs the following main underlying ideas for enhancing 
security: 

• Involve pure randomness in the coding and ciphering scheme so that the decoding complexity without 
knowledge of the secret key employed in the system approaches the complexity of the exhaustive search 
for the secret key. 

• Enhance security of the existing stream cipher via joint employment of pure randomness and coding 
theory, and particularly a dedicated encoding following the homophonic or wire-tap channel encoding 
approaches. 

• Allow a suitable trade-off between the security and the communications rate: increase the security 
towards the limit implied by the secret-key length at the expense of a low to moderate decrease of the 
communications rate. 

Regarding the homophonic and wire-tap channel coding, note the following. The main goals of homo- 
phonic coding are to provide: (i) multiple substitutions of a given source vector via randomness so that 
the coded versions of the source vectors appear as realizations of a random source; (ii) recoverability of the 
source vector based on the given codeword without knowledge of the randomization. The main goals of 
wire-tap channel coding are: (i) amplification of the noise difference between the main and wire-tap channel 
via randomness; (ii) a reliable transmission in the main channel and at the same time to provide a total 
confusion of the wire-tapper who observes the communication in the main channel via a noisy channel (wire- 
tap channel). Accordingly, homophonic coding schemes and wire-tap channel ones have different goals and 
belong to different coding classes, the source coding and the error-correction ones, but they employ the same 
underlying ideas of using randomness and dedicated coding for achieving the desired goals. 

For enhancing the security we exploit the underlying approaches of universal homophonic coding [12] and 
generic wire-tap coding when the main channel is error-free (see [20] and [19], for example). Accordingly, 
we may say either "homophonic coding" or "wire-tap channel coding" to address the dedicated coding that 
enhances security. The main feature of the dedicated coding is that the encoding is based on randomness 
and that the legitimate receiving party who shares a secret key with the corresponding transmitting one 
can perform decoding without knowledge of the randomness employed for the encoding. For simplicity of 
the terminology we mainly (but not always) say "wire-tap channel coding" to describe the dedicated coding 
which provides the enhanced security. 

Let $C_H(\cdot)$ denote a wiretap or homophonic code encoder. To enhance the security of the system considered, 
it is added at the transmitter end (see Figure 2), involving a vector of pure randomness 

$$\mathbf{u} = [u_i]_{i=1}^{m-l} \in \{0,1\}^{m-l},$$

that is, each $u_i$ is the realization of a random variable $U_i$ with distribution $\Pr(U_i = 1) = \Pr(U_i = 0) = 1/2$. 
Note that $C_H(\cdot)$ is invertible. The wiretap encoding is done prior to error-correcting encoding, thus out of 
the $m$ bits of data to be sent, $m - l$ are replaced by random data, leaving only $l$ bits 

$$\mathbf{a} = [a_i]_{i=1}^{l} \in \{0,1\}^{l}$$

of plaintext, to get as in (1) 

$$\mathbf{y} = \mathbf{y}(\mathbf{k}) = C_{ECC}(C_H(\mathbf{a}\|\mathbf{u})) \oplus \mathbf{x} \qquad (2)$$

as codeword to be sent. As before, the receiver obtains 

$$\mathbf{z} = \mathbf{z}(\mathbf{k}) = \mathbf{y} \oplus \mathbf{v} = C_{ECC}(C_H(\mathbf{a}\|\mathbf{u})) \oplus \mathbf{x} \oplus \mathbf{v} \qquad (3)$$

and starts with the decryption 

$$\mathbf{z} \oplus \mathbf{x} = (C_{ECC}(C_H(\mathbf{a}\|\mathbf{u})) \oplus \mathbf{x} \oplus \mathbf{v}) \oplus \mathbf{x} = C_{ECC}(C_H(\mathbf{a}\|\mathbf{u})) \oplus \mathbf{v}.$$

He then first decodes 

$$C_H(\mathbf{a}\|\mathbf{u}).$$

If the decoding is successful, he computes $\mathbf{a}$ using $C_H^{-1}$ and lets the transmitter know he could decode. 
Otherwise he informs the transmitter that retransmission is required. 

Similarly to a linear error-correction code where $C_{ECC}$ can be represented by multiplying the data vector 
by the generator matrix of the code, we can write $C_H$, following the so-called coset encoding proposed by 
Wyner [20], as follows: 

$$C_H(\mathbf{a}\|\mathbf{u}) = [\mathbf{a}\|\mathbf{u}] \begin{bmatrix} \mathbf{h}_1 \\ \mathbf{h}_2 \\ \vdots \\ \mathbf{h}_l \\ \mathbf{G}_C \end{bmatrix} = [\mathbf{a}\|\mathbf{u}]\,\mathbf{G}_H, \qquad (4)$$

Figure 2: Communication system model enhanced with a wire-tap encoder. 

where 

• $\mathbf{G}_C$ is an $(m - l) \times m$ generator matrix for an $(m, m - l)$ linear error-correction code $C$ with rows 
$\mathbf{g}_1, \mathbf{g}_2, \ldots, \mathbf{g}_{m-l}$, 

• $\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_l$ are $l$ linearly independent row vectors from $\{0,1\}^{m} \setminus C$, 

• and $\mathbf{G}_H$ is an $m \times m$ binary matrix corresponding to $C_H(\cdot)$. 

In words, to each $l$-bit message $\mathbf{a} = [a_1, \ldots, a_l]$ is associated a coset determined by 

$$\mathbf{a} \mapsto a_1\mathbf{h}_1 \oplus a_2\mathbf{h}_2 \oplus \ldots \oplus a_l\mathbf{h}_l \oplus C.$$

Though this correspondence is deterministic, a random codeword $\mathbf{c}$ is chosen inside the coset by: 

$$\mathbf{c} = a_1\mathbf{h}_1 \oplus a_2\mathbf{h}_2 \oplus \ldots \oplus a_l\mathbf{h}_l \oplus u_1\mathbf{g}_1 \oplus u_2\mathbf{g}_2 \oplus \ldots \oplus u_{m-l}\mathbf{g}_{m-l},$$

where $\mathbf{u} = [u_1, u_2, \ldots, u_{m-l}]$ is a uniformly distributed random $(m - l)$-bit vector. 
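
Coset encoding as described above can be sketched as follows. The rows `h_rows` and `g_rows` are hypothetical values chosen only for illustration (with m = 4 and l = 2); they are not taken from the paper. The message selects a coset of the code, and the random bits select one codeword inside that coset.

```python
import secrets

def xor_rows(rows, m):
    """XOR a list of length-m binary rows; the empty list gives the zero vector."""
    acc = [0] * m
    for row in rows:
        acc = [p ^ q for p, q in zip(acc, row)]
    return acc

def coset_encode(a, h_rows, g_rows, u=None):
    """c = a_1 h_1 + ... + a_l h_l + u_1 g_1 + ... + u_{m-l} g_{m-l} over GF(2)."""
    m = len(g_rows[0])
    if u is None:
        # pure randomness: one fresh uniform bit per generator row
        u = [secrets.randbits(1) for _ in g_rows]
    chosen = [h for bit, h in zip(a, h_rows) if bit]
    chosen += [g for bit, g in zip(u, g_rows) if bit]
    return xor_rows(chosen, m)

# Hypothetical parameters: C = {0000, 1010, 0101, 1111}, h_1 and h_2 outside C.
g_rows = [[1, 0, 1, 0], [0, 1, 0, 1]]
h_rows = [[0, 0, 1, 0], [0, 0, 0, 1]]

c = coset_encode([1, 0], h_rows, g_rows)  # random element of the coset h_1 + C
```

Calling `coset_encode` repeatedly with the same message returns different codewords, all lying in the same coset, which is what lets the receiver recover the message without knowing `u`.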

2.2 A Dedicated Wiretap Encoder 

In our scenario, we need to combine wiretap encoding with error-correction encoding, both being linear 
operations. Recall that the encoded vector at the transmitter is 

$$C_{ECC}(C_H(\mathbf{a}\|\mathbf{u})),$$

where $\mathbf{a}$ is an $l$-dimensional data vector, and $\mathbf{u}$ is an $(m - l)$-dimensional random vector. Using generic coset coding as 
discussed above with an $(m, m - l)$ code, we now know that 

$$C_H(\mathbf{a}\|\mathbf{u}) = [\mathbf{a}\|\mathbf{u}]\,\mathbf{G}_H,$$

where $\mathbf{G}_H$ is an $m \times m$ matrix, and thus 

$$C_{ECC}(C_H(\mathbf{a}\|\mathbf{u})) = C_{ECC}([\mathbf{a}\|\mathbf{u}]\,\mathbf{G}_H) = [\mathbf{a}\|\mathbf{u}]\,\mathbf{G}_H\mathbf{G}_{ECC} = [\mathbf{a}\|\mathbf{u}]\,\mathbf{G} \qquad (5)$$

where $\mathbf{G}_{ECC}$ is an $m \times n$ binary generator matrix corresponding to $C_{ECC}(\cdot)$, and $\mathbf{G} = \mathbf{G}_H\mathbf{G}_{ECC}$ is an 
$m \times n$ binary matrix summarizing the two successive encodings at the transmitter. 

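Since $\mathbf{G}_H$ multiplies the vector $[\mathbf{a}\|\mathbf{u}]$, where $\mathbf{a}$ is an $l$-dimensional vector and $\mathbf{u}$ an $(m - l)$-dimensional vector, 
it makes sense to write the $m \times m$ matrix $\mathbf{G}_H$ by blocks of size depending on $l$ and $m - l$: 

$$\mathbf{G}_H = \begin{bmatrix} \mathbf{G}_H^{(1)} & \mathbf{G}_H^{(2)} \\ \mathbf{I}_{m-l} & \mathbf{G}_H^{(4)} \end{bmatrix} \qquad (6)$$

where $\mathbf{G}_H^{(1)}$ is an $l \times (m - l)$ matrix, $\mathbf{G}_H^{(2)}$ is an $l \times l$ matrix, $\mathbf{I}_{m-l}$ denotes the $(m - l) \times (m - l)$ identity 
matrix, and finally $\mathbf{G}_H^{(4)}$ is an $(m - l) \times l$ matrix. 
Requirements on the matrix $\mathbf{G}_H$ are: 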



1. Invertibility. The matrix $\mathbf{G}_H$ should be an invertible matrix, so that the receiver can decode the 
wiretap encoding. 

2. Security. The matrix $\mathbf{G}_H$ should map $[\mathbf{a}\|\mathbf{u}]$ so that in the resulting vector each bit of data from $\mathbf{a}$ is 
affected by at least one random bit from $\mathbf{u}$, to make sure that each bit of data is protected. 

3. Sparsity. Both the matrices $\mathbf{G}_H$ and $\mathbf{G}_H^{-1}$ should be as sparse as possible, in order to avoid too much 
computation and communication overheads. 

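Since by (4), the $m - l$ last rows of $\mathbf{G}_H$ form a generator matrix of an $(m, m - l)$ error correction code $C$ in 
systematic form, it has rank $m - l$. The first $l$ rows are then obtained by adding linearly independent vectors 
not in $C$, thus completing a basis of $\{0,1\}^{m}$, resulting automatically in an invertible matrix. A simple way 
to do so is to choose $\mathbf{G}_H^{(1)} = \mathbf{0}_{l \times (m-l)}$ and $\mathbf{G}_H^{(2)} = \mathbf{I}_l$, so that (6) becomes 

$$\mathbf{G}_H = \begin{bmatrix} \mathbf{0}_{l \times (m-l)} & \mathbf{I}_l \\ \mathbf{I}_{m-l} & \mathbf{G}_H^{(4)} \end{bmatrix}.$$

Since 

$$[\mathbf{a}\|\mathbf{u}] \begin{bmatrix} \mathbf{0}_{l \times (m-l)} & \mathbf{I}_l \\ \mathbf{I}_{m-l} & \mathbf{G}_H^{(4)} \end{bmatrix} = [\mathbf{u},\ \mathbf{a} + \mathbf{u}\mathbf{G}_H^{(4)}]$$

and $\mathbf{G}_H^{(4)}$ has no column with only zeroes (it is a block of an error correction code), we have that indeed each 
bit of data from $\mathbf{a}$ is affected by at least one random bit from $\mathbf{u}$. 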



The choice of $\mathbf{G}_H^{(1)} = \mathbf{0}_{l \times (m-l)}$ and $\mathbf{G}_H^{(2)} = \mathbf{I}_l$ makes the $l$ first rows of $\mathbf{G}_H$ as sparse as possible. 



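Example 1 Take $m = 4$, $l = 2$ so that $m - l = 2$, and 

$$\mathbf{G}_H = \begin{bmatrix} \mathbf{G}_H^{(1)} & \mathbf{G}_H^{(2)} \\ \mathbf{I}_2 & \mathbf{G}_H^{(4)} \end{bmatrix}.$$

Clearly $\mathbf{G}_H$ is invertible. The error correction code described by rows 3 and 4 is simply the repetition code. 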
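
The three requirements on the wiretap matrix can be checked mechanically. The sketch below builds the simple block form with the upper-left block zero and the upper-right block the identity, then verifies invertibility over GF(2), checks that the lower-right block has no all-zero column, and decodes. The lower-right block used here (the 2 x 2 identity, so that rows 3 and 4 repeat their input) is an assumption made for illustration only; all function names are hypothetical.

```python
def build_gh(g4):
    """Return [[0, I_l], [I_{m-l}, G4]] where g4 has shape (m-l) x l."""
    k, l = len(g4), len(g4[0])
    top = [[0] * k + [int(i == j) for j in range(l)] for i in range(l)]
    bottom = [[int(i == j) for j in range(k)] + list(g4[i]) for i in range(k)]
    return top + bottom

def vec_mat(v, mat):
    """Row vector times matrix over GF(2)."""
    return [sum(v[i] & mat[i][j] for i in range(len(v))) % 2
            for j in range(len(mat[0]))]

def gf2_rank(mat):
    """Rank over GF(2) by Gaussian elimination on a copy of the rows."""
    rows = [r[:] for r in mat]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [p ^ q for p, q in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def wiretap_decode(w, g4):
    """Invert w = [u, a + u*G4]: read u, then strip u*G4 from the tail."""
    k = len(g4)
    u, tail = w[:k], w[k:]
    return [t ^ s for t, s in zip(tail, vec_mat(u, g4))]

g4 = [[1, 0], [0, 1]]      # assumed lower-right block (illustration only)
gh = build_gh(g4)          # m = 4, l = 2
a, u = [1, 0], [1, 1]
w = vec_mat(a + u, gh)     # wiretap-encoded word [a||u] G_H
invertible = gf2_rank(gh) == len(gh)
protected = all(any(row[j] for row in g4) for j in range(len(g4[0])))
```

Decoding only needs the first m - l entries of `w`, which carry `u` in the clear; this is why the legitimate receiver does not need to share the randomness with the transmitter.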



3 Security against a Passive Adversary 

This section analyzes the security of the proposed scheme against a passive adversary, that is an adversary 
limited to monitoring and recording communications. The system we consider already uses a keystream 
generator to protect the confidentiality of the data. Thus though a passive adversary may try to still 
discover confidential messages, more dangerous is an attack against the secret key, which would endanger 
all the transmissions. Based on what a passive adversary can do, this means mounting a known plaintext 
attack in order to recover the secret key. In the passive known plaintext attacking scenario, with no enhanced 
security, the adversary possesses the pair 

$$(\text{plaintext}, \text{noisy ciphertext}) = (\mathbf{a},\ \mathbf{z} = C_{ECC}(\mathbf{a}) \oplus \mathbf{x} \oplus \mathbf{v}),$$

from which he calculates 

$$C_{ECC}(\mathbf{a}) \oplus \mathbf{z} = \mathbf{x} \oplus \mathbf{v}.$$

He can then use $\mathbf{x} \oplus \mathbf{v}$ for further processing in an attempt to recover the key which generated $\mathbf{x}$. We will 
show how the introduction of the wiretap encoding increases the protection of the key against such attacks. 
In what follows, we use as notation that 

• $u_i$, random bits used in the wiretap encoder, 

• $x_i$, output bits of the keystream generator, 

• $v_i$, random components of the additive noise 

are realizations of certain random variables $U_i$, $X_i$ and $V_i$, respectively, $i = 1, 2, \ldots, n$. We can further assume 
that the plaintext is generated randomly, and thus see $a_i$ as a realization of a random variable $A_i$ as well. 
The corresponding vectors of random variables are denoted as follows: $\mathbf{A}^{l} = [A_i]_{i=1}^{l}$, $\mathbf{U}^{m-l} = [U_i]_{i=1}^{m-l}$, 
$\mathbf{X}^{n} = [X_i]_{i=1}^{n}$, and $\mathbf{V}^{n} = [V_i]_{i=1}^{n}$. 

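Recall from (3) and (5) that the received vector at the receiver is given by 

$$\mathbf{z} = C_{ECC}(C_H(\mathbf{a}\|\mathbf{u})) \oplus \mathbf{x} \oplus \mathbf{v} = [\mathbf{a}\|\mathbf{u}]\,\mathbf{G} \oplus \mathbf{x} \oplus \mathbf{v},$$

where $\mathbf{G} = [g_{i,j}]_{i=1,j=1}^{m,n}$ is an $m \times n$ matrix containing both the wiretap and the error correction encoding. 
Let $\mathbf{z} = [z_i]_{i=1}^{n}$, so that $\mathbf{z}$ can be written componentwise as 

$$z_i = \Big(\Big(\bigoplus_{k=1}^{l} g_{k,i}\,a_k\Big) \oplus \Big(\bigoplus_{k=1}^{m-l} g_{l+k,i}\,u_k\Big) \oplus x_i\Big) \oplus v_i, \quad i = 1, 2, \ldots, n,$$

and $z_i$ appears as the realization of a random variable 

$$Z_i = \Big(\Big(\bigoplus_{k=1}^{l} g_{k,i}\,A_k\Big) \oplus \Big(\bigoplus_{k=1}^{m-l} g_{l+k,i}\,U_k\Big) \oplus X_i\Big) \oplus V_i, \quad i = 1, 2, \ldots, n.$$

We further denote $\mathbf{Z}^{n} = [Z_i]_{i=1}^{n}$, and 

$$\mathbf{Z}^{n} = C_{ECC}(C_H(\mathbf{A}^{l}\|\mathbf{U}^{m-l})) \oplus \mathbf{X}^{n} \oplus \mathbf{V}^{n}. \qquad (7)$$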

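From (6), we have 

$$\begin{aligned} C_H(\mathbf{A}^{l}\|\mathbf{U}^{m-l}) &= [\mathbf{A}^{l}, \mathbf{U}^{m-l}]\,\mathbf{G}_H \\ &= [\mathbf{A}^{l}, \mathbf{U}^{m-l}] \begin{bmatrix} \mathbf{G}_H^{(1)} & \mathbf{G}_H^{(2)} \\ \mathbf{I}_{m-l} & \mathbf{G}_H^{(4)} \end{bmatrix} \\ &= [\mathbf{A}^{l}\mathbf{G}_H^{(1)} + \mathbf{U}^{m-l},\ \mathbf{A}^{l}\mathbf{G}_H^{(2)} + \mathbf{U}^{m-l}\mathbf{G}_H^{(4)}] \\ &= [\mathbf{A}^{l}\mathbf{G}_H^{(1)},\ \mathbf{A}^{l}\mathbf{G}_H^{(2)}] + [\mathbf{U}^{m-l},\ \mathbf{U}^{m-l}\mathbf{G}_H^{(4)}], \end{aligned}$$

and we can rewrite the wiretap encoder as 

$$C_H(\mathbf{A}^{l}\|\mathbf{U}^{m-l}) = C_{H,a}(\mathbf{A}^{l}) \oplus C_{H,u}(\mathbf{U}^{m-l}),$$

where $C_{H,a}$ and $C_{H,u}$ are the operators for the wiretap encoding restricted to $\mathbf{a}$, resp. $\mathbf{u}$: 

$$C_{H,a}(\mathbf{A}^{l}) = [\mathbf{A}^{l}\mathbf{G}_H^{(1)},\ \mathbf{A}^{l}\mathbf{G}_H^{(2)}], \qquad C_{H,u}(\mathbf{U}^{m-l}) = [\mathbf{U}^{m-l},\ \mathbf{U}^{m-l}\mathbf{G}_H^{(4)}].$$

Since the error correcting encoding is also linear, we finally get 

$$\mathbf{Z}^{n} = C_{ECC}(C_{H,a}(\mathbf{A}^{l})) \oplus C_{ECC}(C_{H,u}(\mathbf{U}^{m-l})) \oplus \mathbf{X}^{n} \oplus \mathbf{V}^{n}. \qquad (8)$$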

The lemma below gives a bound on the resistance of the scheme to a known plain text attack where the 
adversary knows the pair (a, z) . 

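Lemma 1 The equivocation of the keystream output knowing the plaintext and the received signal can be 
lower bounded as follows: 

$$H(\mathbf{X}^{n} \mid \mathbf{A}^{l}, \mathbf{Z}^{n}) \geq \min\{H(\mathbf{U}^{m-l}),\ H(\mathbf{X}^{n}) + H(\mathbf{V}^{n})\} + \min\{H(\mathbf{V}^{n}),\ H(\mathbf{X}^{n})\} - \delta(C_{ECC}),$$

where 

$$\delta(C_{ECC}) = H(\epsilon) + \epsilon \log(2^{m-l} - 1) \to 0$$

since $\epsilon \to 0$. 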

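Proof. Employing the entropy chain rule, we have that 

$$\begin{aligned} H(\mathbf{A}^{l}, \mathbf{U}^{m-l}, \mathbf{X}^{n}, \mathbf{V}^{n}, \mathbf{Z}^{n}) &= H(\mathbf{A}^{l}) + H(\mathbf{Z}^{n} \mid \mathbf{A}^{l}) + H(\mathbf{U}^{m-l} \mid \mathbf{A}^{l}, \mathbf{Z}^{n}) \\ &\quad + H(\mathbf{V}^{n} \mid \mathbf{A}^{l}, \mathbf{U}^{m-l}, \mathbf{Z}^{n}) + H(\mathbf{X}^{n} \mid \mathbf{A}^{l}, \mathbf{U}^{m-l}, \mathbf{V}^{n}, \mathbf{Z}^{n}) \\ &= H(\mathbf{A}^{l}) + H(\mathbf{Z}^{n} \mid \mathbf{A}^{l}) + H(\mathbf{U}^{m-l} \mid \mathbf{A}^{l}, \mathbf{Z}^{n}) \\ &\quad + H(\mathbf{V}^{n} \mid \mathbf{A}^{l}, \mathbf{U}^{m-l}, \mathbf{Z}^{n}), \end{aligned}$$

since $H(\mathbf{X}^{n} \mid \mathbf{A}^{l}, \mathbf{U}^{m-l}, \mathbf{V}^{n}, \mathbf{Z}^{n}) = 0$, using that $\mathbf{X}^{n} = C_{ECC}(C_H(\mathbf{A}^{l}\|\mathbf{U}^{m-l})) \oplus \mathbf{Z}^{n} \oplus \mathbf{V}^{n}$ from (7). 
Repeating the entropy chain rule but with another decomposition, we further get that 

$$\begin{aligned} H(\mathbf{A}^{l}, \mathbf{U}^{m-l}, \mathbf{X}^{n}, \mathbf{V}^{n}, \mathbf{Z}^{n}) &= H(\mathbf{A}^{l}) + H(\mathbf{Z}^{n} \mid \mathbf{A}^{l}) + H(\mathbf{X}^{n} \mid \mathbf{A}^{l}, \mathbf{Z}^{n}) \\ &\quad + H(\mathbf{U}^{m-l} \mid \mathbf{A}^{l}, \mathbf{X}^{n}, \mathbf{Z}^{n}) + H(\mathbf{V}^{n} \mid \mathbf{A}^{l}, \mathbf{U}^{m-l}, \mathbf{X}^{n}, \mathbf{Z}^{n}) \\ &= H(\mathbf{A}^{l}) + H(\mathbf{Z}^{n} \mid \mathbf{A}^{l}) + H(\mathbf{X}^{n} \mid \mathbf{A}^{l}, \mathbf{Z}^{n}) \\ &\quad + H(\mathbf{U}^{m-l} \mid \mathbf{A}^{l}, \mathbf{X}^{n}, \mathbf{Z}^{n}), \end{aligned}$$

noticing that $H(\mathbf{V}^{n} \mid \mathbf{A}^{l}, \mathbf{U}^{m-l}, \mathbf{X}^{n}, \mathbf{Z}^{n}) = 0$, using again $\mathbf{V}^{n} = C_{ECC}(C_H(\mathbf{A}^{l}\|\mathbf{U}^{m-l})) \oplus \mathbf{Z}^{n} \oplus \mathbf{X}^{n}$ from (7). 
By combining the two decompositions, we deduce that 

$$H(\mathbf{X}^{n} \mid \mathbf{A}^{l}, \mathbf{Z}^{n}) = H(\mathbf{U}^{m-l} \mid \mathbf{A}^{l}, \mathbf{Z}^{n}) + H(\mathbf{V}^{n} \mid \mathbf{A}^{l}, \mathbf{U}^{m-l}, \mathbf{Z}^{n}) - H(\mathbf{U}^{m-l} \mid \mathbf{A}^{l}, \mathbf{X}^{n}, \mathbf{Z}^{n}).$$

We now reformulate $H(\mathbf{U}^{m-l} \mid \mathbf{A}^{l}, \mathbf{Z}^{n})$ and $H(\mathbf{V}^{n} \mid \mathbf{A}^{l}, \mathbf{U}^{m-l}, \mathbf{Z}^{n})$. First, using this time (8), we have that 
$C_{ECC}(C_{H,u}(\mathbf{U}^{m-l})) = C_{ECC}(C_{H,a}(\mathbf{A}^{l})) \oplus \mathbf{Z}^{n} \oplus \mathbf{X}^{n} \oplus \mathbf{V}^{n}$. Since $C_{ECC}$ and $C_H$ are invertible, note that 
$H(C_{ECC}(C_{H,u}(\mathbf{U}^{m-l}))) = H(\mathbf{U}^{m-l})$, so that 

$$H(\mathbf{U}^{m-l} \mid \mathbf{A}^{l}, \mathbf{Z}^{n}) = H(\mathbf{X}^{n} \oplus \mathbf{V}^{n}).$$

On the other hand, conditioning reduces entropy, namely, 

$$H(\mathbf{U}^{m-l} \mid \mathbf{A}^{l}, \mathbf{Z}^{n}) \leq H(\mathbf{U}^{m-l}),$$

and in order to make explicit the role of the extra randomness brought by the wiretap encoder, we can write 
that 

$$H(\mathbf{U}^{m-l} \mid \mathbf{A}^{l}, \mathbf{Z}^{n}) = \min\{H(\mathbf{U}^{m-l}),\ H(\mathbf{X}^{n} \oplus \mathbf{V}^{n})\} = \min\{H(\mathbf{U}^{m-l}),\ H(\mathbf{X}^{n}) + H(\mathbf{V}^{n})\}$$

since $\mathbf{X}^{n}$ and $\mathbf{V}^{n}$ are mutually independent. 

Similarly, again using (8) to get that $\mathbf{V}^{n} = C_{ECC}(C_{H,u}(\mathbf{U}^{m-l})) \oplus C_{ECC}(C_{H,a}(\mathbf{A}^{l})) \oplus \mathbf{Z}^{n} \oplus \mathbf{X}^{n}$ and 
combining with 

$$H(\mathbf{V}^{n} \mid \mathbf{A}^{l}, \mathbf{U}^{m-l}, \mathbf{Z}^{n}) \leq H(\mathbf{V}^{n}),$$

we obtain that 

$$H(\mathbf{V}^{n} \mid \mathbf{A}^{l}, \mathbf{U}^{m-l}, \mathbf{Z}^{n}) = \min\{H(\mathbf{V}^{n}),\ H(\mathbf{X}^{n})\},$$

which distinguishes the randomness coming from the channel noise and the keystream entropy. 

We are finally left with bounding $H(\mathbf{U}^{m-l} \mid \mathbf{A}^{l}, \mathbf{X}^{n}, \mathbf{Z}^{n})$. Recovering $\mathbf{U}^{m-l}$ when $\mathbf{A}^{l}$, $\mathbf{X}^{n}$ and $\mathbf{Z}^{n}$ are given 
is the decoding problem of removing the noise $\mathbf{V}^{n}$ employing the code $C_{ECC}$ with error probability $P_e$. This 
can be bounded using Fano's inequality: 

$$H(\mathbf{U}^{m-l} \mid \mathbf{A}^{l}, \mathbf{X}^{n}, \mathbf{Z}^{n}) \leq H(P_e) + P_e \log(2^{m-l} - 1) \leq H(\epsilon) + \epsilon \log(2^{m-l} - 1) \to 0$$

since by design of the system, we may assume $P_e = \epsilon \to 0$. This concludes the proof. ■ 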

The interpretation of the lemma is a bound on the resistance of the scheme to a passive known plaintext 
attack. This clearly depends on two parameters: 

• the keystream generator: if the output of the keystream generator has a very high entropy $H(\mathbf{X}^{n}) > 
H(\mathbf{U}^{m-l}, \mathbf{V}^{n}) = H(\mathbf{U}^{m-l}) + H(\mathbf{V}^{n})$, then the lemma tells that 

$$H(\mathbf{X}^{n} \mid \mathbf{A}^{l}, \mathbf{Z}^{n}) \geq H(\mathbf{U}^{m-l}) + H(\mathbf{V}^{n}) - \delta(C_{ECC}).$$

• the pure randomness put in the wiretap encoder: if we do not add it in the system, the lemma shows 
that 

$$H(\mathbf{X}^{n} \mid \mathbf{A}^{l}, \mathbf{Z}^{n}) \geq H(\mathbf{V}^{n}),$$

that is, the information-theoretic security of the keystream depends on the channel noise. 
We illustrate this last claim with an example. 

Example 2 Consider the case of a known plaintext attack when $\mathbf{a} = \mathbf{0}$. We then have 

$$Z_i = X_i \oplus \Big(\bigoplus_{k=1}^{m-l} g_{l+k,i}\,U_k\Big) \oplus V_i, \quad i = 1, 2, \ldots, n.$$

Without the wiretap encoding, the keystream $X_i$ is corrupted, and so protected as well, by the noise on the 
channel, while with addition of the wiretap encoder, it is further protected by the pure randomness added. 

The special case where the channel is noise-free is detailed in the corollary below. This further illustrates 
the effect of pure randomness involved in the wire-tap channel coding. 

Corollary 1 In a noise-free channel, we have 

$$H(\mathbf{X}^{n} \mid \mathbf{A}^{l}, \mathbf{Z}^{n}) \geq \min\{H(\mathbf{U}^{m-l}),\ H(\mathbf{X}^{n})\}.$$

Proof. Since the channel is noise-free, $\mathbf{V} = \mathbf{0}$ and consequently $H(\mathbf{V}) = P_e = 0$. Lemma 1 can be rewritten 
as 

$$\begin{aligned} H(\mathbf{X}^{n} \mid \mathbf{A}^{l}, \mathbf{Z}^{n}) &\geq \min\{H(\mathbf{U}^{m-l}),\ H(\mathbf{X}^{n})\} + \min\{0,\ H(\mathbf{X}^{n})\} - \delta(C_{ECC}) \\ &= \min\{H(\mathbf{U}^{m-l}),\ H(\mathbf{X}^{n})\}. \end{aligned}$$
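
To get a feeling for the size of the bound in Lemma 1 and Corollary 1, the right-hand side can be evaluated numerically. The sketch below is only an illustration: the parameter values (m = 8, l = 4, n = 16, crossover probability 0.05, decoding error 1e-6, and a keystream entropy of 16 bits) are assumptions chosen for the example and do not come from the paper. Entropies are measured in bits, the pure randomness is taken as uniform, and the noise as i.i.d.

```python
from math import log2

def h2(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def fano_term(eps, m, l):
    """delta(C_ECC) = H(eps) + eps * log2(2^(m-l) - 1); requires m > l."""
    if m <= l:
        raise ValueError("the wiretap encoder needs at least one random bit")
    return h2(eps) + eps * log2(2 ** (m - l) - 1)

def lemma1_bound(h_u, h_x, h_v, delta):
    """Right-hand side of Lemma 1 for given entropies (in bits)."""
    return min(h_u, h_x + h_v) + min(h_v, h_x) - delta

# Assumed illustrative parameters (not from the paper).
m, l, n, p, eps = 8, 4, 16, 0.05, 1e-6
h_u = m - l                  # uniform pure randomness: m - l bits
h_v = n * h2(p)              # i.i.d. channel noise with Pr(V_i = 1) = p
h_x = 16.0                   # assumed keystream entropy
delta = fano_term(eps, m, l)

noisy = lemma1_bound(h_u, h_x, h_v, delta)
noise_free = lemma1_bound(h_u, h_x, 0.0, 0.0)   # Corollary 1 setting
```

In the noise-free setting the bound collapses to the smaller of the two entropies involved, so with these assumed values the protection comes entirely from the m - l random bits.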



So far, we have discussed the security of a given keystream generator output, for one instance of 
transmission. We now move to a more realistic scenario. Transmission takes place over time $t = 1, 2, \ldots$, 
and the keystream generator uses a secret key (or just a key) $\mathbf{K}$ based on which it computes its outputs 
$\mathbf{X}^{(t)} = [X_i^{(t)}]_{i=1}^{n}$ in a deterministic way depending on $f$ for a time period of length $\tau$: 

$$\mathbf{X}^{(t)} = \mathbf{X}^{(t)}(\mathbf{K}) = f^{(t)}(\mathbf{K}), \quad t = 1, \ldots, \tau.$$

Note that $f^{(t)}(\mathbf{K})$ is an expansion of the secret key $\mathbf{K}$ via a finite state machine and can be considered as 
an encoding of $|\mathbf{K}|$ bits into a long binary codeword. Correspondingly, we can rewrite the whole system in 
terms of realizations of random variables that depend on time, over the time interval $t = 1, \ldots, \tau$: 

• $\mathbf{A}^{(t)} = [A_i^{(t)}]_{i=1}^{l}$ for the plaintext, 

• $\mathbf{U}^{(t)} = [U_i^{(t)}]_{i=1}^{m-l}$ for the pure randomness used in the wiretap encoder, 

• $\mathbf{V}^{(t)} = [V_i^{(t)}]_{i=1}^{n}$ for the channel noise, 

• $\mathbf{Z}^{(t)} = [Z_i^{(t)}]_{i=1}^{n}$ for the received signal. 

Similarly as above, we have 

$$\mathbf{Z}^{(t)} = C_{ECC}(C_{H,a}(\mathbf{A}^{(t)}) \oplus C_{H,u}(\mathbf{U}^{(t)})) \oplus f^{(t)}(\mathbf{K}) \oplus \mathbf{V}^{(t)}.$$

The key $\mathbf{K}$ is represented as a vector of random variables drawn independently from a uniform distribution 
over $\{0, 1\}$, so that $H(\mathbf{K}) = |\mathbf{K}|$. We further use the following block notations: 

$$\begin{aligned} \mathbf{A}^{\tau l} &= [\mathbf{A}^{(1)}\|\mathbf{A}^{(2)}\|\ldots\|\mathbf{A}^{(\tau)}] \\ \mathbf{U}^{\tau(m-l)} &= [\mathbf{U}^{(1)}\|\mathbf{U}^{(2)}\|\ldots\|\mathbf{U}^{(\tau)}] \\ \mathbf{V}^{\tau n} &= [\mathbf{V}^{(1)}\|\mathbf{V}^{(2)}\|\ldots\|\mathbf{V}^{(\tau)}] \\ \mathbf{Z}^{\tau n} &= [\mathbf{Z}^{(1)}\|\mathbf{Z}^{(2)}\|\ldots\|\mathbf{Z}^{(\tau)}]. \end{aligned}$$

We can now state the main theorem of this section, which describes the security of the enhanced system 
against a passive adversary regarding the secret key recovery. 

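Theorem 1 When $\Pr(V_i^{(j)} = 0) \neq \Pr(V_i^{(j)} = 1) \neq 1/2$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, \tau$, there exists a threshold 
$\tau_{thres}$ such that 

$$H(\mathbf{K} \mid \mathbf{A}^{\tau l}, \mathbf{Z}^{\tau n}) \begin{cases} > 0 & \text{for } \tau < \tau_{thres}, \\ \to 0 & \text{for } \tau > \tau_{thres}. \end{cases}$$

Proof. When $\tau = 1$, $\mathbf{X}^{(1)} = \mathbf{X}^{n} = f^{(1)}(\mathbf{K})$ and accordingly $H(\mathbf{X}^{(1)}) = H(\mathbf{K})$, thus Lemma 1 directly 
implies that $H(\mathbf{X}^{n} \mid \mathbf{A}^{l}, \mathbf{Z}^{n}) = H(\mathbf{K} \mid \mathbf{A}^{l}, \mathbf{Z}^{n}) > 0$ is achievable. 
When $\tau > 1$ grows, we employ the following analysis. 

By using two different decompositions of $H(\mathbf{A}^{\tau l}, \mathbf{U}^{\tau(m-l)}, \mathbf{X}^{\tau n}, \mathbf{V}^{\tau n}, \mathbf{Z}^{\tau n})$ via the entropy chain rule as 
done in Lemma 1, we get 

$$H(\mathbf{K} \mid \mathbf{A}^{\tau l}, \mathbf{Z}^{\tau n}) = H(\mathbf{U}^{\tau(m-l)} \mid \mathbf{A}^{\tau l}, \mathbf{Z}^{\tau n}) + H(\mathbf{V}^{\tau n} \mid \mathbf{A}^{\tau l}, \mathbf{U}^{\tau(m-l)}, \mathbf{Z}^{\tau n}) - H(\mathbf{U}^{\tau(m-l)} \mid \mathbf{A}^{\tau l}, \mathbf{K}, \mathbf{Z}^{\tau n}). \qquad (9)$$

Note that knowing $\mathbf{A}^{\tau l}$, $\mathbf{Z}^{\tau n}$ can be considered as a $\tau n$-length degraded version of a binary codeword with 
$\tau(m - l) + |\mathbf{K}|$ information bits which is corrupted by a noise vector $\mathbf{V}^{\tau n}$. Indeed, without knowing the 
key, decoding $\mathbf{U}^{\tau(m-l)}$ is not possible, so the adversary also needs to try to decode $\mathbf{K}$. Assuming that the 
decoding error probability of this code is $P^{*}$, Fano's inequality implies that 

$$H(\mathbf{U}^{\tau(m-l)} \mid \mathbf{A}^{\tau l}, \mathbf{Z}^{\tau n}) \leq H(\mathbf{U}^{\tau(m-l)}, \mathbf{K} \mid \mathbf{A}^{\tau l}, \mathbf{Z}^{\tau n}) \leq H(P^{*}) + P^{*} \log(2^{\tau(m-l)+|\mathbf{K}|} - 1).$$

Combining the decoding ability of $C_{ECC}$ with a minimum distance decoding yields a decoding error for the 
aggregated code of size $2^{\tau(m-l)+|\mathbf{K}|}$ that tends to zero provided long enough codewords, that is $P^{*} \to 0$, and 
accordingly $H(\mathbf{U}^{\tau(m-l)} \mid \mathbf{A}^{\tau l}, \mathbf{Z}^{\tau n}) \to 0$ when $\tau$ is large enough. 
In a similar manner and employing 

$$H(\mathbf{V}^{\tau n} \mid \mathbf{A}^{\tau l}, \mathbf{U}^{\tau(m-l)}, \mathbf{Z}^{\tau n}) \leq H(\mathbf{V}^{\tau n}, \mathbf{K} \mid \mathbf{A}^{\tau l}, \mathbf{U}^{\tau(m-l)}, \mathbf{Z}^{\tau n}),$$

the decoding ability of $C_{ECC}$ with a minimum distance decoding as used above implies that 
$H(\mathbf{V}^{\tau n} \mid \mathbf{A}^{\tau l}, \mathbf{U}^{\tau(m-l)}, \mathbf{Z}^{\tau n}) \to 0$ when $\tau$ is large enough. 

To take care of $H(\mathbf{U}^{\tau(m-l)} \mid \mathbf{A}^{\tau l}, \mathbf{K}, \mathbf{Z}^{\tau n})$, we again use a decoding argument, since $\mathbf{Z}^{\tau n}$ is known. However, 
it is important to note here that $\mathbf{K}$ is known too. Thus even though we look at a block 

$$\mathbf{U}^{\tau(m-l)} = [\mathbf{U}^{(1)}\|\mathbf{U}^{(2)}\|\ldots\|\mathbf{U}^{(\tau)}],$$

the knowledge of $\mathbf{K}$ makes each block $\mathbf{U}^{(t)}$ independent, and thus we can decode each of them separately 
and the probability of error is $P_e^{\tau}$. Fano's inequality finally yields 

$$H(\mathbf{U}^{\tau(m-l)} \mid \mathbf{A}^{\tau l}, \mathbf{K}, \mathbf{Z}^{\tau n}) \leq H(P_e^{\tau}) + P_e^{\tau} \log(2^{\tau(m-l)} - 1) \leq H(\epsilon^{\tau}) + \epsilon^{\tau} \log(2^{\tau(m-l)} - 1)$$

and 

$$H(\mathbf{U}^{\tau(m-l)} \mid \mathbf{A}^{\tau l}, \mathbf{K}, \mathbf{Z}^{\tau n}) \to 0 \qquad (10)$$

since $P_e = \epsilon \to 0$ by design of $C_{ECC}$. 

The above consideration of the cases $\tau = 1$ and $\tau \gg 1$ also implies the existence of a threshold $\tau_{thres}$. 

■ 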

The statement is intuitively clear. The security depends on the length $|\mathbf{K}|$ of the key, noting that this 
length is fixed in the system. Accordingly, when the keystream generator is used for a period $\tau$ that varies, as 
long as $\tau < \tau_{thres}$, the key is protected by the randomness of the noisy channel and of the wiretap encoder, 
but that protection cannot last forever if the adversary collects too much data. 

Note that all this analysis is true for "realistic channels" where the noise is not uniformly distributed. 
Uniformly distributed noise in the communication channel makes error-correction infeasible, which explains 
the assumption in the above theorem. 

Theorem 1 directly implies the following corollary for noiseless channels. 

Corollary 2 When $\mathbf{V}^{\tau n} = \mathbf{0}$ and the parameter $\tau$ is large enough we have: 

$$H(\mathbf{K} \mid \mathbf{A}^{\tau l}, \mathbf{Z}^{\tau n}) = 0. \qquad (11)$$



4 Security against an Active Adversary 

We now consider an active and therefore more powerful adversary. There are many possible scenarios for an 
active adversary. We assume in this work that 

1. he can modify the data on the communication channel, that is, inject controlled noise, 

2. he can learn the effect of the modified channel at the receiving side, by listening to the feedback link 
that tells whether decoding was successful. 

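Let us be more precise. While the transmitter sends

$$\mathbf{y} = C_{ECC}(C_H(\mathbf{a} \| \mathbf{u})) \oplus \mathbf{x}$$

in an already security enhanced setting ([5]), the receiver sees its noisy version

$$\mathbf{z} = \mathbf{y} \oplus \mathbf{v}.$$

The active adversary is allowed to inject some extra noise $\mathbf{v}^*$ over the channel, so that now the legitimate receiver sees $\mathbf{y} \oplus \mathbf{v}'$, where $\mathbf{v}'$ contains both the noise $\mathbf{v}$ coming from the channel and the noise $\mathbf{v}^*$ controlled by the adversary:

$$\mathbf{z} = \mathbf{y} \oplus \mathbf{v} \oplus \mathbf{v}^* = C_{ECC}(C_H(\mathbf{a} \| \mathbf{u})) \oplus \mathbf{x} \oplus \mathbf{v} \oplus \mathbf{v}^*. \tag{12}$$

As earlier, the receiver first decrypts its message using its secret key and locally generated keystream

$$\mathbf{z} \oplus \mathbf{x} = C_{ECC}(C_H(\mathbf{a} \| \mathbf{u})) \oplus \mathbf{v} \oplus \mathbf{v}^*$$

and then tries to decode $\mathbf{z} \oplus \mathbf{x}$:

$$C_{ECC}^{-1}(\mathbf{z} \oplus \mathbf{x}) = C_{ECC}^{-1}(C_{ECC}(C_H(\mathbf{a} \| \mathbf{u})) \oplus \mathbf{v} \oplus \mathbf{v}^*) = C_H(\mathbf{a} \| \mathbf{u})$$

under the assumption that the error correcting code can correct the errors introduced by $\mathbf{v}$, so as to get

$$\mathbf{a} = C_H^{-1}(C_{ECC}^{-1}(\mathbf{z} \oplus \mathbf{x})).$$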

Because of the extra noise $\mathbf{v}^*$, the probability of decoding correctly at the receiver may decrease. In the meantime, the active attacker can listen to the feedback channel so that he knows whether the decoding failed or was successful. His goal is again to find the key. His strategy then consists in adding different noise vectors $\mathbf{v}^*$ and observing the feedback channel to see whether the chosen noise made the decoding fail, in order to gather information.
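The attack loop described above can be sketched in a few lines of Python. This is only an illustration under strong assumptions that are not part of the scheme itself: the error-correcting code is a 3-fold repetition code, the homophonic map masks each data bit with one random bit, and all function names (`homophonic_encode`, `ecc_decode`, `transmit`, ...) are hypothetical.

```python
import random

REP = 3  # repetition factor of the toy error-correcting code (assumption)

def xor(p, q):
    """Bitwise XOR of two equal-length bit lists."""
    return [a ^ b for a, b in zip(p, q)]

def homophonic_encode(a, u):
    """Toy invertible linear map C_H: mask each data bit with a random bit."""
    return xor(a, u) + list(u)

def homophonic_decode(w):
    """Inverse of the toy C_H, truncated to the data bits a."""
    half = len(w) // 2
    return xor(w[:half], w[half:])

def ecc_encode(w):
    """Toy C_ECC: repeat every bit REP times."""
    return [bit for bit in w for _ in range(REP)]

def ecc_decode(c):
    """Majority vote inside each group of REP bits."""
    return [int(sum(c[i:i + REP]) > REP // 2) for i in range(0, len(c), REP)]

def transmit(a, u, x, v, v_star):
    """Return (z, f_d): the received word and the decoding-success flag."""
    y = xor(ecc_encode(homophonic_encode(a, u)), x)   # encode, then encrypt
    z = xor(xor(y, v), v_star)                        # channel noise plus injected noise
    a_hat = homophonic_decode(ecc_decode(xor(z, x)))  # decrypt, then decode
    return z, int(a_hat == a)

if __name__ == "__main__":
    rng = random.Random(0)
    a = [1, 0]
    u = [rng.randint(0, 1) for _ in a]
    n = REP * 2 * len(a)
    x = [rng.randint(0, 1) for _ in range(n)]
    quiet = [0] * n
    for v_star in ([1] + [0] * (n - 1), [1, 1] + [0] * (n - 2)):
        _, f_d = transmit(a, u, x, quiet, v_star)
        print("injected weight", sum(v_star), "-> f_d =", f_d)
```

In this toy setting a single injected bit flip is absorbed by the code, whereas two flips inside the same repetition group make the decoding fail, and that one-bit outcome is what the adversary reads from the feedback link.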

We keep our earlier notation, that is

• $u_i$, random bits used in the wiretap encoder,

• $x_i$, output bits of the keystream generator,

• $v'_i$, random components of the additive noise $\mathbf{v}' = \mathbf{v} \oplus \mathbf{v}^*$,

• $a_i$, bits of the plain text,

• $z_i$, bits of the received message

are realizations of certain random variables $U_i$, $X_i$, $V_i$, $V'_i$, $V_i^*$, $Z_i$, $i = 1, 2, \ldots, n$, and $A_i$, $i = 1, 2, \ldots, l$.

The corresponding vectors of random variables are denoted as follows: $\mathbf{A}^l = [A_i]_{i=1}^{l}$, $\mathbf{U}^{m-l} = [U_i]_{i=1}^{m-l}$, $\mathbf{X}^n = [X_i]_{i=1}^{n}$, $\mathbf{V}^n = [V_i]_{i=1}^{n}$, $\mathbf{V}'^{n} = [V'_i]_{i=1}^{n}$, $\mathbf{V}^{*n} = [V_i^*]_{i=1}^{n}$, and $\mathbf{Z}^n = [Z_i]_{i=1}^{n}$. Similarly to the passive setting,

$$\mathbf{Z}^n = C_{ECC}(C_{H,a}(\mathbf{A}^l)) \oplus C_{ECC}(C_{H,u}(\mathbf{U}^{m-l})) \oplus \mathbf{X}^n \oplus \mathbf{V}^n \oplus \mathbf{V}^{*n}. \tag{13}$$

Finally, let $f_d$ be a binary flag which indicates whether the decoding result is indeed $\mathbf{a}$ or has failed, and accordingly $f_d$ can be considered as a realization of a binary random variable $F_d$.

The lemma below gives a bound on the resistance of the scheme to an active attack where the adversary not only controls the noise but also knows $\mathbf{a}$, $\mathbf{z}$ and $f_d$.

Lemma 2 The equivocation of the keystream segment knowing the plaintext, the received signal, and the decoding tag, can be lower bounded as follows:

$$H(\mathbf{X}^n \mid \mathbf{A}^l, \mathbf{Z}^n, F_d) \ge \min\{H(\mathbf{U}^{m-l}), H(\mathbf{X}^n) + H(\mathbf{V}^n \mid F_d)\} + \min\{H(\mathbf{V}^n \mid F_d), H(\mathbf{X}^n)\} - \delta(C_{ECC}),$$

where

$$\delta(C_{ECC}) = H(P_e) + P_e \log(2^{m-l} - 1) \to 0,$$

since $P_e \to 0$.
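To get a feeling for how quickly the correction term $\delta(C_{ECC})$ vanishes, the following small sketch evaluates it numerically. The logarithm is assumed to be base 2 (entropy in bits), and the function names and the sample values of $m$, $l$ and $P_e$ are ours, chosen only for illustration.

```python
import math

def binary_entropy(p):
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def fano_delta(p_e, m, l):
    """delta(C_ECC) = H(P_e) + P_e * log2(2^(m-l) - 1), assuming base-2 logs."""
    if m <= l:
        raise ValueError("need m > l so that at least one random bit is used")
    return binary_entropy(p_e) + p_e * math.log2(2 ** (m - l) - 1)

if __name__ == "__main__":
    # Illustrative parameters only: m - l = 4 random bits per block.
    for p_e in (1e-1, 1e-2, 1e-3, 1e-6):
        print(f"P_e = {p_e:g}  ->  delta = {fano_delta(p_e, 8, 4):.6f} bits")
```

For decoding error probabilities below 1/2 both summands shrink as $P_e$ decreases, so the penalty subtracted in the lemma becomes negligible for a well-designed code.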

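Proof. As in Lemma 1, we start with two different chain rule decompositions of the same joint entropy. On the one hand,

$$\begin{aligned}
H(\mathbf{A}^l, \mathbf{U}^{m-l}, \mathbf{X}^n, \mathbf{V}'^{n}, \mathbf{Z}^n, F_d)
&= H(\mathbf{A}^l) + H(\mathbf{Z}^n \mid \mathbf{A}^l) + H(\mathbf{U}^{m-l} \mid \mathbf{A}^l, \mathbf{Z}^n) \\
&\quad + H(F_d \mid \mathbf{A}^l, \mathbf{U}^{m-l}, \mathbf{Z}^n) + H(\mathbf{V}'^{n} \mid \mathbf{A}^l, \mathbf{U}^{m-l}, \mathbf{Z}^n, F_d) \\
&\quad + H(\mathbf{X}^n \mid \mathbf{A}^l, \mathbf{U}^{m-l}, \mathbf{V}'^{n}, \mathbf{Z}^n, F_d) \\
&= H(\mathbf{A}^l) + H(\mathbf{Z}^n \mid \mathbf{A}^l) + H(\mathbf{U}^{m-l} \mid \mathbf{A}^l, \mathbf{Z}^n) \\
&\quad + H(\mathbf{V}'^{n} \mid \mathbf{A}^l, \mathbf{U}^{m-l}, \mathbf{Z}^n, F_d),
\end{aligned}$$

since from (13) we have that $\mathbf{X}^n = C_{ECC}(C_H(\mathbf{A}^l \| \mathbf{U}^{m-l})) \oplus \mathbf{Z}^n \oplus \mathbf{V}'^{n}$, implying $H(\mathbf{X}^n \mid \mathbf{A}^l, \mathbf{U}^{m-l}, \mathbf{V}'^{n}, \mathbf{Z}^n, F_d) = 0$, and $H(F_d \mid \mathbf{A}^l, \mathbf{U}^{m-l}, \mathbf{Z}^n) = 0$, since knowing $\mathbf{A}^l$ and $\mathbf{Z}^n$, decoding can be performed on $\mathbf{Z}^n$ and the decoded value can be compared to $\mathbf{A}^l$, yielding $F_d$.

On the other hand,

$$\begin{aligned}
H(\mathbf{A}^l, \mathbf{U}^{m-l}, \mathbf{X}^n, \mathbf{V}'^{n}, \mathbf{Z}^n, F_d)
&= H(\mathbf{A}^l) + H(\mathbf{Z}^n \mid \mathbf{A}^l) + H(\mathbf{X}^n \mid \mathbf{A}^l, \mathbf{Z}^n) \\
&\quad + H(F_d \mid \mathbf{A}^l, \mathbf{X}^n, \mathbf{Z}^n) + H(\mathbf{U}^{m-l} \mid \mathbf{A}^l, \mathbf{X}^n, \mathbf{Z}^n, F_d) \\
&\quad + H(\mathbf{V}'^{n} \mid \mathbf{A}^l, \mathbf{U}^{m-l}, \mathbf{X}^n, \mathbf{Z}^n, F_d) \\
&= H(\mathbf{A}^l) + H(\mathbf{Z}^n \mid \mathbf{A}^l) + H(\mathbf{X}^n \mid \mathbf{A}^l, \mathbf{Z}^n) \\
&\quad + H(\mathbf{U}^{m-l} \mid \mathbf{A}^l, \mathbf{X}^n, \mathbf{Z}^n, F_d),
\end{aligned}$$

noticing that $H(\mathbf{V}'^{n} \mid \mathbf{A}^l, \mathbf{U}^{m-l}, \mathbf{X}^n, \mathbf{Z}^n, F_d) = 0$, again using from (13) that $\mathbf{V}'^{n} = C_{ECC}(C_H(\mathbf{A}^l \| \mathbf{U}^{m-l})) \oplus \mathbf{Z}^n \oplus \mathbf{X}^n$, and that $H(F_d \mid \mathbf{A}^l, \mathbf{X}^n, \mathbf{Z}^n) = 0$ for the same reason as above.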
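By combining the two decompositions, we deduce that

$$H(\mathbf{X}^n \mid \mathbf{A}^l, \mathbf{Z}^n) = H(\mathbf{U}^{m-l} \mid \mathbf{A}^l, \mathbf{Z}^n, F_d) + H(\mathbf{V}'^{n} \mid \mathbf{A}^l, \mathbf{U}^{m-l}, \mathbf{Z}^n, F_d) - H(\mathbf{U}^{m-l} \mid \mathbf{A}^l, \mathbf{X}^n, \mathbf{Z}^n, F_d), \tag{14}$$

where

$$H(\mathbf{U}^{m-l} \mid \mathbf{A}^l, \mathbf{Z}^n, F_d) = \min\{H(\mathbf{U}^{m-l}), H(\mathbf{X}^n \oplus \mathbf{V}'^{n} \mid F_d)\}$$

since $C_{ECC}(C_H(\mathbf{U}^{m-l})) = C_{ECC}(C_H(\mathbf{A}^l)) \oplus \mathbf{Z}^n \oplus \mathbf{X}^n \oplus \mathbf{V}'^{n}$ implies that

$$H(\mathbf{U}^{m-l} \mid \mathbf{A}^l, \mathbf{Z}^n, F_d) = H(\mathbf{X}^n \oplus \mathbf{V}'^{n} \mid F_d)$$

and conditioning reduces entropy, namely,

$$H(\mathbf{U}^{m-l} \mid \mathbf{A}^l, \mathbf{Z}^n, F_d) \le H(\mathbf{U}^{m-l}).$$

Similarly, again using (13) and that

$$H(\mathbf{V}'^{n} \mid \mathbf{A}^l, \mathbf{U}^{m-l}, \mathbf{Z}^n, F_d) \le H(\mathbf{V}'^{n} \mid F_d),$$

we obtain that

$$H(\mathbf{V}'^{n} \mid \mathbf{A}^l, \mathbf{U}^{m-l}, \mathbf{Z}^n, F_d) = \min\{H(\mathbf{V}'^{n} \mid F_d), H(\mathbf{X}^n)\}.$$

To summarize, Equation (14) is now given by

$$H(\mathbf{X}^n \mid \mathbf{A}^l, \mathbf{Z}^n) = \min\{H(\mathbf{U}^{m-l}), H(\mathbf{X}^n \oplus \mathbf{V}'^{n} \mid F_d)\} + \min\{H(\mathbf{V}'^{n} \mid F_d), H(\mathbf{X}^n)\} - H(\mathbf{U}^{m-l} \mid \mathbf{A}^l, \mathbf{X}^n, \mathbf{Z}^n, F_d)$$

where $\mathbf{V}'^{n} = \mathbf{V}^n \oplus \mathbf{V}^{*n}$ and $\mathbf{V}^{*n}$ is known to the adversary, so that we in fact have

$$H(\mathbf{X}^n \mid \mathbf{A}^l, \mathbf{Z}^n) = \min\{H(\mathbf{U}^{m-l}), H(\mathbf{X}^n) + H(\mathbf{V}^n \mid F_d)\} + \min\{H(\mathbf{V}^n \mid F_d), H(\mathbf{X}^n)\} - H(\mathbf{U}^{m-l} \mid \mathbf{A}^l, \mathbf{X}^n, \mathbf{Z}^n, F_d)$$

using further that $\mathbf{X}^n$ and $\mathbf{V}^n$ are mutually independent.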

We are finally left with bounding $H(\mathbf{U}^{m-l} \mid \mathbf{A}^l, \mathbf{X}^n, \mathbf{Z}^n, F_d)$. For the adversary, recovering $\mathbf{U}^{m-l}$ when $\mathbf{A}^l$, $\mathbf{X}^n$ and $\mathbf{Z}^n$ are given is the decoding problem of removing the noise $\mathbf{V}^n$ (he knows $\mathbf{V}^{*n}$) employing the code $C_{ECC}$ with error probability $P_e$. This can be bounded using Fano's inequality:

$$H(\mathbf{U}^{m-l} \mid \mathbf{A}^l, \mathbf{X}^n, \mathbf{Z}^n, F_d) = H(\mathbf{U}^{m-l} \mid \mathbf{A}^l, \mathbf{X}^n, \mathbf{Z}^n) \le H(P_e) + P_e \log(2^{m-l} - 1)$$

since knowing whether the receiver could decode the worst noise does not affect the error capability of $C_{ECC}$. This concludes the proof. ■

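Let us compare the results of Lemmas 1 and 2:

$$H(\mathbf{X}^n \mid \mathbf{A}^l, \mathbf{Z}^n) \ge \min\{H(\mathbf{U}^{m-l}), H(\mathbf{X}^n) + H(\mathbf{V}^n)\} + \min\{H(\mathbf{V}^n), H(\mathbf{X}^n)\} - \delta(C_{ECC}),$$

$$H(\mathbf{X}^n \mid \mathbf{A}^l, \mathbf{Z}^n, F_d) \ge \min\{H(\mathbf{U}^{m-l}), H(\mathbf{X}^n) + H(\mathbf{V}^n \mid F_d)\} + \min\{H(\mathbf{V}^n \mid F_d), H(\mathbf{X}^n)\} - \delta(C_{ECC}),$$

where

$$\delta(C_{ECC}) = H(P_e) + P_e \log(2^{m-l} - 1) \to 0,$$

since $P_e \to 0$. As expected, the equivocation in the case of an active adversary is smaller than for a passive adversary, since $H(\mathbf{V}^n \mid F_d) \le H(\mathbf{V}^n)$.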
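The gap between $H(\mathbf{V}^n)$ and $H(\mathbf{V}^n \mid F_d)$ can be made concrete on a toy model. The sketch below is an illustration under assumptions of ours, not part of the analysis: the channel noise is i.i.d. Bernoulli($p$), and the flag equals 1 exactly when the total noise $\mathbf{v} \oplus \mathbf{v}^*$ has Hamming weight at most $t$ (a bounded-distance decoder). Both entropies are computed by exhaustive enumeration, so $n$ must stay small.

```python
import itertools
import math

def entropy(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def noise_entropies(n, p, t, v_star=None):
    """Return (H(V^n), H(V^n | F_d)) for the toy model described above."""
    v_star = v_star or (0,) * n
    patterns = list(itertools.product((0, 1), repeat=n))
    prob = {v: p ** sum(v) * (1 - p) ** (n - sum(v)) for v in patterns}

    def flag(v):
        # Decoding succeeds iff the total noise v XOR v_star has weight <= t.
        weight = sum(a ^ b for a, b in zip(v, v_star))
        return int(weight <= t)

    h_v = entropy(prob.values())
    h_v_given_f = 0.0
    for f in (0, 1):
        group = [prob[v] for v in patterns if flag(v) == f]
        mass = sum(group)
        if mass > 0:
            h_v_given_f += mass * entropy(q / mass for q in group)
    return h_v, h_v_given_f

if __name__ == "__main__":
    h_v, h_cond = noise_entropies(n=6, p=0.1, t=1)
    print(f"H(V^n) = {h_v:.4f} bits, H(V^n | F_d) = {h_cond:.4f} bits")
```

Because the flag is a deterministic function of the noise in this model, the reduction equals the entropy of the flag itself: each feedback observation removes at most one bit of the adversary's uncertainty about the noise.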

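Based on the above, we easily get a counterpart of Theorem 1 for the case of an active adversary.

Theorem 2 When $\Pr(V_i^{(j)} = 0) \neq \Pr(V_i^{(j)} = 1) \neq 1/2$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, \tau$, there exists a threshold $\tau_{\mathrm{thres,act}}$ such that

$$H(\mathbf{K} \mid \mathbf{A}^{\tau l}, \mathbf{Z}^{\tau n}, F_d) \begin{cases} > 0 & \text{for } \tau < \tau_{\mathrm{thres,act}}, \\ \to 0 & \text{for } \tau > \tau_{\mathrm{thres,act}}. \end{cases}$$

We have that $\tau_{\mathrm{thres,act}} \le \tau_{\mathrm{thres}}$, the threshold for a passive adversary.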
Proof. When $\tau = 1$, $\mathbf{X}^{(1)} = \mathbf{X}^n = f^{(1)}(\mathbf{K})$, and Lemma 2 directly implies that $H(\mathbf{X}^n \mid \mathbf{A}^l, \mathbf{Z}^n, F_d) = H(\mathbf{K} \mid \mathbf{A}^l, \mathbf{Z}^n, F_d) > 0$ is achievable.

When $\tau > 1$ grows, we know from (14) that

$$H(\mathbf{K} \mid \mathbf{A}^{\tau l}, \mathbf{Z}^{\tau n}, F_d) = H(\mathbf{U}^{\tau(m-l)} \mid \mathbf{A}^{\tau l}, \mathbf{Z}^{\tau n}, F_d) + H(\mathbf{V}^{\tau n} \mid \mathbf{A}^{\tau l}, \mathbf{U}^{\tau(m-l)}, \mathbf{Z}^{\tau n}, F_d) - H(\mathbf{U}^{\tau(m-l)} \mid \mathbf{A}^{\tau l}, \mathbf{K}, \mathbf{Z}^{\tau n}, F_d). \tag{15}$$

Now it is shown in the proof of Theorem 1 that every term tends to zero, using a decoding argument, which will hold similarly here, since the knowledge of $F_d$ cannot make the decoding more difficult. ■

5 Practical Implications and Applications Issues

This section provides a generic discussion of the usefulness and possible applications of the proposed approach.

5.1 Implications of the security evaluation

The analysis given in Sections 3 and 4 shows that in systems where the encoding-encryption paradigm is employed, involvement of pure randomness via concatenation of dedicated wire-tap and error-correction coding (instead of error-correction only) provides an increased cryptographic security, by combining pseudo-randomness, randomness and coding, which in a known-plaintext cryptanalytic scenario implies an increased resistance against threats on the secret key.

More precisely, the information-theoretic security evaluation points out the following desirable security properties of the proposed approach: (i) when the sample available for cryptanalysis is below a certain size, the scheme provides uncertainty about the secret key; (ii) recovery of the secret key remains a computationally hard problem even if the available sample is such that the posterior uncertainty about the secret key tends to zero. The main consequence of (i) is that even if exhaustive search were to be employed for the secret key recovery, a (large) number of candidates will appear. Statement (ii) is an implication of the proofs of Theorems 1 and 2, where the reduction to zero of the posterior uncertainty about the secret key is obtained assuming a decoding whose complexity is proportional to the exhaustive search over all possible secret keys. Accordingly, the uncertainty tends to zero at the expense of a decoding with exponential complexity. In other words, the decrease of the uncertainty about the secret key as the sample available for cryptanalysis grows is a consequence of the decoding capabilities of low-rate random binary block codes, but it comes at the expense of a decoding complexity which is exponential in the secret key length.

The above features (i) and (ii) hold not only in a passive attacking scenario where the attacker performs cryptanalysis based on recording the ciphertext from a public communication channel, but also in certain active attacking scenarios where the attacker can modify the ciphertext and learn the effects of these modifications.

5.2 Framework for applications

The encoding-encryption paradigm for secure and reliable communications enjoys the following desirable properties: (i) when the decryption is performed by bitwise XORing the keystream to the ciphertext, an error in a bit before decryption causes an error in the corresponding bit after decryption, without any error propagation, and (ii) the error-free keystream is not available when the communication channel is a noisy one.

The proposed approach for enhancing the security of communication systems which follow the encoding-encryption paradigm could be employed in the design of these systems from scratch as well as in upgrading existing ones.

In the case of upgrading existing systems, the implementation assumption is that the already existing binary linear block error-correction code $(m, n)$, which encodes $m$ bits into a codeword from $GF(2^n)$, can be replaced with a binary block code $(m', n)$ with the same error correction capability but with $m' > m$. Accordingly, $m' - m$ random bits can be concatenated with the $m$ information bits and mapped into $m'$ new bits via a homophonic encoder. The output of the homophonic encoder is the input of the error-correcting one. Taking into account the notation from Section 2, this means that instead of performing $C_{ECC}(\mathbf{a})$, which is a linear mapping $\{0,1\}^m \to \{0,1\}^n$, the following should be performed: $C_{ECC}(C_H(\mathbf{a} \| \mathbf{u}))$, where $\mathbf{a} \| \mathbf{u}$ is a concatenation of an $m$-dimensional vector and an $(m' - m)$-dimensional one, $C_H(\cdot)$ is a linear mapping $\{0,1\}^{m'} \to \{0,1\}^{m'}$ and $C_{ECC}(\cdot)$ is a linear mapping $\{0,1\}^{m'} \to \{0,1\}^n$. On the receiving side, the decoding procedures after decryption are straightforward (see Fig. 2): the error correction decoding removes the random errors, and the message $\mathbf{a}$ is obtained by truncating the output of the inverse linear mapping corresponding to the homophonic decoding.

In the case of a design of the encoding-encryption system from scratch, the design should include a coding box which performs the concatenation of homophonic and error-correction coding in a manner which fits the rate of the concatenated code to the given constraints.

Note that from an implementation point of view, replacement of a linear block encoding by a concate- 
nation of a block linear homophonic and error correction encoding is a replacement of one binary matrix 
with another binary matrix which is the product of the matrices corresponding to the homophonic and 
error-correction encoders. Accordingly, the implementation complexity of two concatenated codes could be 
approximately the same as the implementation complexity of an error correcting code only. 
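The matrix remark above can be checked on a toy instance. In the sketch below, which is an illustration under assumptions of ours, the sizes are $m = 2$, $m' = 4$ and $n = 12$, the homophonic matrix `G_H` masks each data bit with one random bit, and `G_ECC` is the generator of a 3-fold repetition code. Row-vector convention is used, and all names are hypothetical.

```python
import itertools

def mat_mul_gf2(a, b):
    """Matrix product over GF(2) for matrices given as lists of rows."""
    return [[sum(a[i][k] & b[k][j] for k in range(len(b))) % 2
             for j in range(len(b[0]))]
            for i in range(len(a))]

def encode(bits, generator):
    """Encode a row vector of bits with a generator matrix over GF(2)."""
    return mat_mul_gf2([list(bits)], generator)[0]

# Toy homophonic encoder: (a0, a1, u0, u1) -> (a0^u0, a1^u1, u0, u1).
G_H = [[1, 0, 0, 0],
       [0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1]]

# Toy error-correcting encoder: repeat each of the 4 bits three times.
G_ECC = [[1 if j // 3 == i else 0 for j in range(12)] for i in range(4)]

# Single matrix implementing the concatenation of both encoders.
G = mat_mul_gf2(G_H, G_ECC)

if __name__ == "__main__":
    for s in itertools.product((0, 1), repeat=4):
        two_stage = encode(encode(s, G_H), G_ECC)
        assert encode(s, G) == two_stage
    print("one-matrix and two-stage encodings agree on all 16 inputs")
```

The combined matrix `G` has the same dimensions as the generator matrix of an $(m', n)$ error-correcting code used alone, so in this toy setting the encoder cost is one vector-matrix product in either case.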

6 Conclusion 

The problem addressed in this paper is the one of enhancing security of certain communications systems 
which employ error correction encoding of the messages and encryption of the obtained codewords in or- 
der to provide both secrecy and reliability of the transmission. This paper yields a proposal for provid- 
ing the enhanced security of the considered systems employing randomness and dedicated coding and the 
information-theoretic security evaluation of the proposed approach. The analysis given in this paper implies 
that in the systems where the encoding-encryption paradigm is employed, the cryptographic security can 
be enhanced via involvement of a homophonic coding based on pure randomness as follows: Instead of just 
error-correction encoding before the encryption, this paper proposes employment of a concatenation of linear 
block homophonic and error-correction encoding. The proposal and its cryptographic security evaluation are 
given in a generic manner and accordingly yield a generic framework for particular applications. 

Note that the information-theoretic consideration of the cryptographic security yields a basic evaluation 
of the related cryptographic features. On the other hand, the information-theoretic security evaluation also 
provides specification of the settings when it is possible to perform the secret key recovery, but it does not 
specify and only indicates the expected complexity of this problem. We show that with the aid of a dedicated 
wire-tap encoder, the amount of uncertainty that the adversary has about the key given all the information 
he could gather during different passive or active attacks he can mount, is a decreasing function of the 
sample available for cryptanalysis. This means that the wire-tap encoder can indeed provide an information 
theoretical security level over a period of time, after which a large enough sample is collected and the function 
tends to zero, entering a regime in which a computational security analysis is needed. 

An interesting issue for further work is the characterization of the transition region in which the uncertainty drops from a certain value to close to zero. Also, because the uncertainty ultimately tends to zero, an evaluation of the cryptographic security based on computational complexity is a direction for future work.